Time domain watermarking of multimedia signals

EPO - DG 1, 28.03.2002
(75)



The present invention relates to apparatus and methods for encoding and 
decoding information in multimedia signals, such as audio, video or data signals. 



Watermarking of multimedia signals is a technique for the transmission of 
additional data along with the multimedia signal. For instance, watermarking techniques can 
be used to embed copyright and copy control information into audio signals. 

The main requirement of a watermarking scheme is that it is not observable
(i.e. in the case of an audio signal, it is inaudible) whilst being robust to attacks to remove the
watermark from the signal (e.g. removing the watermark will damage the signal). It will be
appreciated that the robustness of a watermark will normally be a trade off against the quality
of the signal in which the watermark is embedded. For instance, if a watermark is strongly
embedded into an audio signal (and is thus difficult to remove) then it is likely that the
quality of the audio signal will be reduced.

Various types of audio watermarking schemes have been proposed, each with 
its own advantages and disadvantages. For instance, one type of audio watermarking scheme 
is to use temporal correlation techniques to embed the desired data (e.g. copyright 
information) into the audio signal. This technique is effectively an echo-hiding algorithm, in 
which the strength of the echo is determined by solving a quadratic equation. The quadratic 
equation is generated by autocorrelation values at two positions: one at delay equal to τ, and
one at delay equal to 0. In such a scheme, as echoes of the audio signal are added to the
original audio signal, the resulting signal is in fact both an amplitude and a phase modulated
version of the original audio signal. At the detector, the watermark is extracted by
determining the ratio of the auto correlation function at the two delay positions. 

This correlation technique has a number of drawbacks. For instance, it is only 
possible to embed the watermark where the resulting quadratic equation has real roots, and 
consequently this reduces the robustness (ability of the watermark to withstand attacks) for a
given audio quality. Further, the performance of the correlation algorithm is dependent upon
the value of the delay τ and the characteristics of the original signal. This is a significant
drawback.

Also known are watermarking schemes based on the amplitude modulation of DFT (Discrete
Fourier Transform) coefficients. As such schemes require the calculation of DFTs at both the
encoder and the decoder, the resulting hardware for implementing such DFT schemes tends
to be relatively complex, and hence the scheme tends to be slow to perform and costly.
Further, watermarks cannot be satisfactorily embedded in audio segments that have sparse
frequency characteristics, and hence the DFT scheme does not work well with particular
types of music.

WO 00/00969 describes an alternative technique for embedding or encoding
auxiliary signals (such as copyright information) into a multimedia host or cover signal. A
replica of the cover signal, or a portion of the cover signal in a particular domain (time,
frequency or space), is generated according to a stego key, which specifies modification
values to the parameters of the cover signal. The replica signal is then modified by an
auxiliary signal corresponding to the information to be embedded, and inserted back into the
cover signal so as to form the stego signal.

At the decoder, in order to extract the original auxiliary data, a replica of the
stego signal is generated in the same manner as the replica of the original cover signal, which
requires the use of the same stego key. The resulting replica is then correlated with the
received stego signal, so as to extract the auxiliary signal. The extraction of the auxiliary
signal is thus relatively complex, and requires the stego key at both the encoder (or
embedder) and decoder (or detector). Additionally, a brute force search is required to
synchronize to the auxiliary signal at the detector.

Further, performance of the payload extraction is dependent on how well the
auxiliary signal can be estimated. In a system with a high expected error rate of the payload
bits in the auxiliary signal, this is very difficult to achieve. Solutions would lead to very
complex error correction methods, or significantly limit the information capacity.

It is an object of the present invention to provide a watermarking scheme that
substantially addresses at least one of the problems of the prior art.

In a first aspect, the present invention provides a method of generating a
watermark signal for embedding in a multimedia signal, the method comprising the steps of:
(a) generating two sequences of values, the second sequence being a circularly shifted
version of the first sequence; and (b) generating a watermark signal by adding the values of
the first sequence to the respective values in the corresponding positions of the second
sequence.

Preferably, each value of the first and second sequences is represented by a
pulse of preferable width T_s so as to form rectangular wave signals.

Preferably, in step (a) a window shaping function is applied to convert each of
the rectangular signals into respective smoothly varying signals, with the resulting smoothly
varying signals being added in step (b) to form the watermark signal.

Preferably, each one of said sequences of values is convolved with a window
shaping function which has a width of at least T_s, so as to generate two smoothly varying
signals, these smoothly varying signals being added together in step (b) so as to form the
watermark signal.

Preferably, said window shaping function has a band limited frequency
behavior and a smooth temporal behavior.

Preferably, said window shaping function has a symmetric or anti-symmetric
temporal behavior.

Preferably, said window shaping function is at least one of a raised
cosine function and a bi-phase function.

Preferably, the watermark signal is formed by the addition of the two
smoothly varying signals with a relative delay of T_r, where T_r < T_s.

Preferably, T_r is chosen such that maximum amplitude points of the first
smoothly varying signal coincide with zero-crossings of the second smoothly varying signal,
and vice-versa.

Preferably, said watermark signal has a payload that is encoded in the 
combination of said two sequences of values. 

In another aspect, the present invention provides an apparatus arranged to
generate a watermark signal for embedding in a multimedia signal, the apparatus comprising:
(a) a sequence generator arranged to use a first sequence of values to generate a second
sequence of values, the second sequence being a circularly shifted version of the first
sequence; and (b) a signal generator arranged to generate a watermark signal by adding the
values of the first sequence to the respective values in the corresponding positions of the
second sequence.

Preferably, the apparatus further comprises a signal conditioner arranged to
convert each sequence of values into a smoothly varying signal.



PHNL020240EPP 

4 26.03.2002 

Preferably, the apparatus is arranged to generate said first sequence of values 
by circularly shifting a primary sequence of values. 

In a further aspect, the present invention provides a method of embedding a
watermark in a multimedia signal, the method comprising the steps of: (a) generating a
watermark signal equal to the sum of two sequences of values, the second sequence being a
circularly shifted version of the first sequence of values; (b) generating a host modifying
multimedia signal as a product of the watermark signal and the multimedia signal; (c)
generating a watermarked multimedia signal by adding a scaled version of said host
modifying multimedia signal to the multimedia signal.

Preferably, said scaled version of the host modifying signal is generated by
controlling the scaling factor by a predetermined cost-function.

Preferably, said cost function comprises multiple scaling factors, each scaling
factor being defined separately for one or more of the plurality of frequency bands in the
multimedia signal.

Preferably, said frequency bands are determined according to a model of the
human auditory and/or visual system.

Preferably, said host modifying multimedia signal is generated by
multiplying said watermark signal with an extracted portion of the multimedia signal.

Preferably, said extracted portion of the multimedia signal is obtained by
filtering at least a portion of the multimedia signal with respect to at least one of frequency,
space and time.

The method preferably further comprises the steps of: (d) generating a second
watermark signal equal to the sum of a third and a fourth sequence of values, the fourth
sequence being a circularly shifted version of the third sequence of values; (e) extracting a
second portion of the multimedia signal, the second portion being filtered such that it does
not overlap with said first portion; (f) generating a watermarked multimedia signal by adding
the product of the second watermark signal and the second extracted portion of the
multimedia signal to the watermarked multimedia signal.

In another aspect the present invention provides an apparatus arranged to
embed a watermark signal in a multimedia signal, the apparatus comprising: (a) a watermark
generator arranged to generate a signal equal to the sum of two sequences of values, the
second sequence being a circularly shifted version of the first sequence of values; (b) an
output signal generator arranged to generate a watermarked multimedia signal by adding the
product of the watermark signal and the multimedia signal to the multimedia signal.

Preferably, the apparatus further comprises a signal extractor arranged to
extract a first portion of the multimedia signal.

In a further aspect the present invention provides a multimedia signal
comprising a watermark, wherein the original multimedia signal has been watermarked by
modifying the temporal envelope of the original signal by the watermark, the watermark
comprising the sum of a first and a second sequence of values, the second sequence of
which is a circularly shifted version of the first sequence.

In another aspect the present invention provides a method of detecting a
watermark signal embedded in a multimedia signal, the method comprising the steps of: (a)
receiving a multimedia signal that may potentially be watermarked by a watermark signal
modifying the temporal envelope of the host multimedia signal; (b) extracting an estimate of
the watermark from said received signal; and (c) correlating the estimate of the watermark
with a reference version of the watermark so as to determine whether the received signal was
watermarked.

Preferably, the watermark signal has a payload, and the method further
comprises the step of determining the payload of the watermark.

In a further aspect, the present invention provides a watermark detector
arranged to detect whether a watermark signal is embedded within a multimedia
signal, the watermark detector comprising: (a) a receiver arranged to receive a multimedia
signal that may potentially be watermarked by a watermark signal modifying the temporal
envelope of the host multimedia signal; (b) an extractor arranged to extract an estimate of the
watermark from said received signal; and (c) a correlator arranged to correlate the estimate of
the watermark with a reference version of the watermark so as to determine whether the
received signal was watermarked.

Preferably, the apparatus further comprises a payload detector arranged to 
determine if a payload is present within said watermark and to determine the value of said 
payload. 

For a better understanding of the invention, and to show how embodiments of
the same may be carried into effect, reference will now be made, by way of example, to the 
accompanying diagrammatic drawings in which: 

Figure 1 is a diagram illustrating a watermark embedding apparatus in
accordance with an embodiment of the present invention;

Figure 2 shows a signal portion extraction filter H used in one preferred
embodiment;

Figures 3a and 3b show respectively the typical amplitude and phase responses
as a function of frequency of the filter H used in Fig. 2;

Figure 4 shows the payload embedding and watermark conditioning stage;

Figure 5 is a diagram illustrating the details of the watermark conditioning
apparatus H_c of Fig. 4, including charts of the associated signals at each stage;

Figures 6a and 6b show two preferred alternative window shaping functions
s[n] in the form of respectively a raised cosine function and a bi-phase function;

Figures 7a and 7b show respectively the frequency spectra for a watermark
sequence conditioned with a raised cosine and a bi-phase shaping window function;

Figure 8 is a diagram illustrating a watermark detector in accordance with an
embodiment of the present invention;

Figure 9 diagrammatically shows the whitening filter H_w of Fig. 8, for use in
conjunction with a raised cosine shaping window function;

Figure 10 diagrammatically shows the whitening filter H_w of Fig. 8, for use in
conjunction with a bi-phase window shaping function; and

Figure 11 shows a typical shape of the correlation function output from the
correlator of the watermark detector shown in Fig. 8.

Fig. 1 shows a block diagram of the apparatus required to perform the digital
signal processing for embedding a multi-bit payload watermark w_c into a host signal x in
accordance with a preferred embodiment of the present invention.

A host signal x is provided at an input 12 of the apparatus. The host signal x is
passed in the direction of output 14 via the adder 22. However, a replica of the host signal x
(input 8) is split off in the direction of the multiplier 18, for carrying the watermark
information.

The watermark signal w_c is obtained from the payload embedder and
watermark conditioning apparatus 6, and derived from the watermark random sequence w_s
(input 4), which is input to the payload embedder and watermark conditioning apparatus. The
multiplier 18 is utilized to calculate the product of the watermark signal w_c and the replica
audio signal x. The resulting product, w_c x, is then passed via a gain controller 24 to the adder
22. The gain controller 24 is used to amplify or attenuate the signal by a gain factor α.

The gain factor α controls the trade off between the audibility and the
robustness of the watermark. It may be a constant, or variable in at least one of time,
frequency and space. The apparatus in Fig. 1 shows that, when α is variable, it can be
automatically adapted via a signal analyzing unit 26 based upon the properties of the host
signal x. Preferably, the gain α is automatically adapted, so as to minimize the impact on the
signal quality, according to a properly chosen perceptibility cost-function, such as a
psycho-acoustic model of the human auditory system (HAS). Such a model is, for instance, described
in the paper by E. Zwicker, "Audio Engineering and Psychoacoustics: Matching signals to the
final receiver, the Human Auditory System", Journal of the Audio Engineering Society, Vol.
39, pp. 115-126, March 1991.

In the following, an audio watermark is utilized, by way of example only, to 
describe this embodiment of the present invention. 

The resulting watermarked audio signal y is then obtained at the output 14 of the
embedding apparatus 10 by adding an appropriately scaled version of the product of w_c and x
to the host signal:

$$ y[n] = x[n] + \alpha\, w_c[n]\, x[n]. \qquad (1) $$

Preferably, the watermark w_c is chosen such that, when multiplied with x, it
predominantly modifies the short time envelope of x.
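
By way of illustration only, equation (1) can be sketched in a few lines of Python. The function name `embed` and the sample values are assumptions made for this example; the host signal, the watermark and the gain factor α are taken as plain lists and a float, with α held constant rather than adapted by a perceptual model.

```python
def embed(host, watermark, alpha):
    """Return y[n] = x[n] + alpha * w_c[n] * x[n] for every sample n."""
    if len(host) != len(watermark):
        raise ValueError("host and watermark must have the same length")
    return [x + alpha * w * x for x, w in zip(host, watermark)]


# Hypothetical sample values, chosen only to exercise the function.
host = [0.5, -0.25, 1.0, -1.0]
watermark = [1.0, -1.0, 0.0, 0.5]
marked = embed(host, watermark, alpha=0.1)
```

Because the watermark multiplies the host, a sample where w_c[n] is zero passes through unchanged, and the modification scales with the local signal amplitude.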

Fig. 2 shows one preferred embodiment in which the input 8 to the multiplier
18 in Fig. 1 is obtained by filtering the replica of the host signal x using a filter H in the
filtering unit 15. If the filter output is denoted by x_b, then according to this preferred
embodiment, the watermarked signal is generated by adding an appropriately scaled version
of the product of x_b and the watermark w_c to the host signal x.

Let x̄_b be defined such that x̄_b = x - x_b, and y_b be defined such that
y = y_b + x̄_b, then the watermarked signal y can be written as

$$ y[n] = (1 + \alpha\, w_c[n])\, x_b[n] + \bar{x}_b[n] \qquad (2) $$

and the envelope modulated portion of the watermarked signal y is given as

$$ y_b[n] = (1 + \alpha\, w_c[n])\, x_b[n] \qquad (3) $$
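
The decomposition in equations (2) and (3) can be checked numerically with the minimal sketch below. The three-tap moving average stands in for the filter H purely as an assumption to keep the example short (the description prefers a linear phase band-pass filter), and the names `moving_average` and `embed_in_band` are hypothetical.

```python
def moving_average(signal, taps=3):
    """Stand-in for the filter H: centered moving average, zero padded."""
    half = taps // 2
    out = []
    for n in range(len(signal)):
        window = [signal[k] for k in range(n - half, n + half + 1)
                  if 0 <= k < len(signal)]
        out.append(sum(window) / taps)
    return out


def embed_in_band(host, watermark, alpha, extract=moving_average):
    """Modulate only the extracted portion x_b and add back the remainder."""
    x_b = extract(host)                          # filtered portion x_b
    x_rest = [x - b for x, b in zip(host, x_b)]  # remainder x - x_b
    y_b = [(1 + alpha * w) * b for w, b in zip(watermark, x_b)]  # equation (3)
    return [yb + r for yb, r in zip(y_b, x_rest)]                # equation (2)
```

The result equals x[n] + α w_c[n] x_b[n], so the part of the host outside the extracted portion is left untouched.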



Preferably, as shown in Fig. 3, the filter H is a linear phase band-pass filter
characterized by its lower cut off frequency f_L and upper cut off frequency f_H. As can be seen
in Fig. 3b, the filter H has a linear phase response with respect to frequency f within the pass
band (BW). Thus, when H is a band-pass filter, x_b and x̄_b are the in-band and out-of-band
components of the host signal respectively. For optimum performance, it is preferable that
the signals x_b and x̄_b are in phase. This is achieved by appropriately compensating for the
phase distortion produced by filter H.

In Fig. 4, the details of the payload embedder and watermark conditioning unit
6 are shown. In this unit the watermark seed signal is converted into a multi-bit watermark
signal w_c.

Firstly, a finite-length, preferably zero mean and uniformly distributed random
sequence w_s is generated using a random number generator with an initial seed S. As will be
appreciated later, it is preferable that this initial seed S is known to both the embedder and the
detector, such that a copy of the watermark signal can be generated at the detector for
comparison purposes. This results in the sequence of length L_w:

$$ w_s[k] \in [-1, 1], \quad k = 0, 1, 2, \ldots, L_w - 1. \qquad (4) $$
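
As a minimal sketch of this step, a seeded generator from Python's standard `random` module can produce the sequence of equation (4). The function name `generate_sequence` and the seed and length values are assumptions for illustration; the draw is uniform on [-1, 1], so it is zero mean only in expectation.

```python
import random


def generate_sequence(seed, length):
    """Return `length` values drawn uniformly from [-1, 1] using seed S."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


# The embedder and the detector share the seed, so both obtain the same sequence.
w_embedder = generate_sequence(seed=1234, length=8)
w_detector = generate_sequence(seed=1234, length=8)
```

Sharing only the seed, rather than the whole sequence, is what allows the detector to regenerate its reference copy.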

Then the sequence w_s is circularly shifted by the amounts d_1 and d_2 using the
circularly shifting units 30 to form sequences w_d1 and w_d2 respectively. It will be
appreciated that these two sequences (w_d1 and w_d2) are effectively a first sequence and a
second sequence, with the second sequence being circularly shifted with respect to the first.
Each sequence w_di, i = 1, 2, is subsequently multiplied with a respective sign bit r_i in the
multiplying unit 40, where r_i = +1 or -1, the respective values of r_1 and r_2 remaining constant,
and only changing when the payload of the watermark is changed. Each sequence is then
converted into a periodic, slowly varying narrow-band signal w_i of length L_w T_s by the
watermark conditioning circuit 20 shown in Fig. 4. Finally, the slowly varying narrow-band
signals w_1 and w_2 are added with a relative delay T_r (where T_r < T_s) to give the multi-bit
payload watermark signal w_c. This is achieved by first delaying the signal w_2 by the amount
T_r using delaying unit 45 and subsequently by adding it to w_1 with the adding unit 50.
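
The circular shift and sign multiplication can be sketched as follows. The shift direction (towards higher indices) and the names `circular_shift` and `shifted_signed` are assumptions of this example; the description does not fix a direction.

```python
def circular_shift(seq, d):
    """Shift seq by d positions towards higher indices, wrapping around."""
    d %= len(seq)
    return seq[-d:] + seq[:-d] if d else list(seq)


def shifted_signed(seq, d, r):
    """Circularly shift seq by d, then multiply by the sign bit r (+1 or -1)."""
    if r not in (1, -1):
        raise ValueError("sign bit must be +1 or -1")
    return [r * v for v in circular_shift(seq, d)]
```

For a sequence [1, 2, 3, 4], a shift of 1 gives [4, 1, 2, 3], and a shift of 2 with sign bit -1 gives [-3, -4, -1, -2].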

Fig. 5 shows in more detail the watermark conditioning apparatus 20 used in
the payload embedder and watermark conditioning apparatus 6. The watermark random
sequence w_s is input to the conditioning apparatus 20.

For convenience, the modification of only one of the sequences w_di is shown
in Fig. 5, but it will be appreciated that each of the sequences is modified in a similar manner,
with the results being added to obtain the watermark signal w_c.

As shown in Fig. 5, each watermark signal sequence w_di[k], i = 1, 2 is applied to
the input of a sample repeater 180. Chart 181 illustrates one of the possible sequences as a
sequence of values of random numbers between +1 and -1, with the sequence being of length
L_w. The sample repeater repeats each value within the watermark random sequence T_s times,
so as to generate a pulse train signal of rectangular shape. T_s is referred to as the watermark
symbol period and represents the span of the watermark symbol in the audio signal. Chart
183 shows the results of the signal illustrated in chart 181 once it has passed through the
sample repeater 180.

A window shaping function s[n], such as a raised cosine window, is then
applied to convert the rectangular pulse signals derived from w_d1 and w_d2 into slowly varying
signals w_1[n] and w_2[n] respectively.

Chart 184 shows a typical raised cosine window shaping function, which is
also of period T_s.
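
A minimal sketch of the sample repeater followed by window shaping is given below. The particular raised cosine expression, 0.5(1 - cos(2πn/T_s)), is one common form and is an assumption here, as are the function names; the window is applied by multiplying each repeated symbol by one period of the window.

```python
import math


def repeat_samples(seq, ts):
    """Repeat every value ts times to form a rectangular pulse train."""
    return [v for v in seq for _ in range(ts)]


def raised_cosine(ts):
    """One period of a raised cosine window of length ts (assumed form)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / ts)) for n in range(ts)]


def condition(seq, ts, window):
    """Shape each rectangular symbol pulse with the window of period ts."""
    pulses = repeat_samples(seq, ts)
    return [p * window[n % ts] for n, p in enumerate(pulses)]
```

With T_s = 4 the window is approximately [0, 0.5, 1, 0.5], so the sequence [1, -1] becomes a positive lobe followed by a negative lobe, each starting from zero.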

The generated signals w_1[n] and w_2[n] are then added up with a relative delay
T_r (where T_r < T_s) to give the multi-bit payload watermark signal w_c[n], i.e.

$$ w_c[n] = w_1[n] + w_2[n - T_r] \qquad (5) $$

The value of T_r is chosen such that the zero crossings of w_1 match the
maximum amplitude points of w_2 and vice-versa. Thus, for a raised cosine window shaping
function T_r = T_s/2, and for a bi-phase window shaping function T_r = T_s/4. For other window
shaping functions, other values of T_r are possible.
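
Equation (5) can be sketched as below. Treating both signals as periodic, so that the delayed samples wrap around, is an assumption of this example (the description calls the conditioned signals periodic); the name `combine` is hypothetical.

```python
def combine(w1, w2, tr):
    """Return w_c[n] = w_1[n] + w_2[n - T_r], with wrap-around indexing."""
    if len(w1) != len(w2):
        raise ValueError("w1 and w2 must have the same length")
    total = len(w1)
    return [w1[n] + w2[(n - tr) % total] for n in range(total)]
```

For a raised cosine symbol of period 4, approximately [0, 0.5, 1, 0.5], a delay of T_s/2 = 2 places the zero of the delayed copy under the maximum of the undelayed one, which is the alignment the choice of T_r is meant to achieve.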

As will be appreciated by the below description, during detection the
watermarked signal carrying w_c[n] will generate two correlation peaks that are separated by
pL (as can be seen in Fig. 11). The value pL is part of the payload, and is defined as

$$ pL = |d_2 - d_1| \bmod \lceil L_w / 2 \rceil \qquad (6) $$
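
As a worked example of equation (6), the sketch below computes pL from the two shift amounts. It assumes the reading of the equation given above, and the function name `payload_offset` is hypothetical.

```python
import math


def payload_offset(d1, d2, lw):
    """Return pL = |d2 - d1| mod ceil(L_w / 2)."""
    return abs(d2 - d1) % math.ceil(lw / 2)
```

With L_w = 16, shifts of 3 and 10 give pL = 7, while shifts of 0 and 12 give pL = 4, because a separation of 12 in a circular sequence of length 16 exceeds half the sequence length.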

In addition to pL, extra information can be encoded by changing the relative
signs of the embedded watermarks.

In the detector, this is seen as a relative sign r_sign between the correlation
peaks. It will be seen that r_sign can take four possible values, and may be defined as:

where ρ_1 = sign(cL_1) and ρ_2 = sign(cL_2) are respectively estimates of the sign bits r_1 (input 80)
and r_2 (input 90) of Fig. 4, and cL_1 and cL_2 are the values of the correlation peaks
corresponding to w_d1 and w_d2 respectively. The overall watermark payload pL_w is then given
as a combination of r_sign and pL:

$$ pL_w = \langle r_{sign}, pL \rangle. \qquad (8) $$

The maximum information (I_max), in number of bits, that can be carried by a
watermark sequence of length L_w is thus given by:

$$ I_{max} = \log_2\left(4 \lceil L_w / 2 \rceil\right) \text{ bits} \qquad (9) $$
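
The capacity figure follows from counting the distinguishable payloads: four values of r_sign combined with ⌈L_w/2⌉ values of pL. A minimal sketch, assuming that reading of equation (9) and using the hypothetical name `max_payload_bits`:

```python
import math


def max_payload_bits(lw):
    """Return log2(4 * ceil(L_w / 2)), the maximum payload in bits."""
    return math.log2(4 * math.ceil(lw / 2))
```

For example, a sequence of length L_w = 512 gives log2(4 x 256) = 10 bits.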



In such a scheme, the payload is immune to relative offset between the
embedder and the detector, and also to possible time scale modifications. The window
shaping function has been identified as one of the main parameters that controls the
robustness and audibility behavior of the present watermarking scheme. As illustrated in
Figs. 6a and 6b, two examples of possible window shaping functions are herein described: a
raised cosine function and a bi-phase function.

It is preferable to use a bi-phase window function instead of a raised cosine
window function, so as to obtain a quasi DC-free watermark signal. This is illustrated in
Figs. 7a and 7b, which show the frequency spectra corresponding to a watermark sequence
(in this case a sequence of w_s[k] = {1, 1, -1, 1, -1, -1}) conditioned with respectively a raised
cosine and a bi-phase window shaping function. As can be seen, the frequency spectrum for
the raised cosine conditioned watermark sequence has a maximum at frequency f = 0, whilst
the frequency spectrum for the bi-phase shaped watermark sequence has a minimum at f = 0,
i.e. it has very little DC component.

Useful information is only contained in the non-DC component of the
watermark. Consequently, for the same added watermark energy, a watermark conditioned
with the bi-phase window will carry more useful information than one conditioned by the
raised cosine window. As a result, the bi-phase window offers superior audibility quality for
the same robustness or, conversely, it allows a better robustness for the same audibility
quality.

Such a bi-phase function could also be utilized as a window shaping function
for other watermarking schemes. In other words, a bi-phase function could be applied to
reduce the DC component of signals (such as a watermark) that are to be incorporated into
another signal.
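
The difference in DC content can be illustrated by comparing the mean value of one period of each window. Both window expressions below are assumptions made for this sketch: the raised cosine is taken as 0.5(1 - cos(2πn/T_s)), and the bi-phase window as a positive raised cosine lobe over the first half period followed by its negative over the second half.

```python
import math


def raised_cosine(ts):
    """One period of a raised cosine window (assumed form)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / ts)) for n in range(ts)]


def bi_phase(ts):
    """Assumed bi-phase window: a positive lobe, then the same lobe negated."""
    half = ts // 2
    lobe = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / half)) for n in range(half)]
    return lobe + [-v for v in lobe]


def dc_component(window):
    """Mean value of the window, i.e. its contribution at frequency zero."""
    return sum(window) / len(window)


dc_raised = dc_component(raised_cosine(16))
dc_biphase = dc_component(bi_phase(16))
```

Under these assumptions the raised cosine window has a mean of one half, so every symbol adds a DC offset proportional to its value, whereas the two lobes of the bi-phase window cancel and each symbol contributes no DC.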

Fig. 8 shows a block diagram of a watermark detector (200, 300, 400). The
detector consists of three major stages: (a) the watermark symbol extraction stage (200), (b)
the buffering and interpolation stage (300), and (c) the correlation and decision stage (400).

In the symbol extraction stage (200), the received watermarked signal y'[n] is
processed to generate multiple (N_b) estimates of the watermarked sequence, which are
multiplexed into the signal w_e[m]. These estimates of the watermark sequence are required to
resolve any time offset that may exist between the embedder and the detector, so that the
watermark detector can synchronize to the watermark sequence inserted in the host signal.

In the buffering and interpolation stage (300), these estimates are
de-multiplexed into N_b separate buffers. An interpolation is subsequently applied to each buffer
to resolve possible timescale modifications that may have occurred. For instance, a drift in
sampling (clock) frequency may result in a stretch or shrink in the time domain signal (i.e.
the watermark may have been stretched or shrunk).
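
The description leaves the interpolation method open. Purely as an illustration, the sketch below resamples a buffer to a chosen length by linear interpolation, which is one simple way to undo a stretch or shrink; the choice of linear interpolation and the name `resample_linear` are assumptions, not part of the described scheme.

```python
def resample_linear(buffer, n_out):
    """Resample buffer to n_out samples by linear interpolation."""
    if n_out < 1 or not buffer:
        raise ValueError("need a non-empty buffer and n_out >= 1")
    if len(buffer) == 1 or n_out == 1:
        return [buffer[0]] * n_out
    step = (len(buffer) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(buffer) - 2)   # index of the left neighbour
        frac = pos - lo
        out.append(buffer[lo] * (1.0 - frac) + buffer[lo + 1] * frac)
    return out
```

A detector could try several candidate output lengths and keep the one that yields the strongest correlation.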

In the correlation and decision stage (400), the content of each buffer is
correlated with the reference watermark and the maximum correlation peaks are compared
against a threshold to determine the likelihood of whether the watermark is indeed embedded
within the received signal y'[n].
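
A minimal sketch of the correlation and decision step is given below, under simplifying assumptions: the estimate is noise free, symbol conditioning is omitted, the correlation is circular, and the sequence length, shifts, seed and threshold are hypothetical values chosen for the example.

```python
import random


def circular_shift(seq, d):
    """Shift seq by d positions towards higher indices, wrapping around."""
    d %= len(seq)
    return seq[-d:] + seq[:-d] if d else list(seq)


def circular_correlation(estimate, reference):
    """c[d] = sum over k of estimate[k] * reference[(k - d) mod L]."""
    length = len(reference)
    return [
        sum(estimate[k] * reference[(k - d) % length] for k in range(length))
        for d in range(length)
    ]


def detect(estimate, reference, threshold):
    """Return (watermark present?, positions of the two largest peaks)."""
    corr = circular_correlation(estimate, reference)
    ranked = sorted(range(len(corr)), key=lambda d: abs(corr[d]), reverse=True)
    peaks = sorted(ranked[:2])
    present = all(abs(corr[d]) > threshold for d in peaks)
    return present, peaks


rng = random.Random(2002)
reference = [rng.uniform(-1.0, 1.0) for _ in range(256)]
estimate = [a + b for a, b in zip(circular_shift(reference, 20),
                                  circular_shift(reference, 75))]
present, peaks = detect(estimate, reference, threshold=40.0)
```

The two peaks appear at the shift amounts used to build the estimate, and their separation carries the pL part of the payload. Absolute values are used so that peaks of either sign are found.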

In order to maximize the accuracy of the watermark detection, the watermark
detection process is typically carried out over a length of received signal y'[n] that is 3 to 4
times that of the watermark sequence length. Thus each watermark symbol to be detected can
be constructed by taking the averages of several symbols. This averaging process is referred
to as smoothing, and the number of times the averaging is done is referred to as the
smoothing factor s_f. Thus, the detection window length L_D is the length of the audio segment
(in number of samples) over which a watermark detection truth-value is reported.
Consequently, L_D = s_f L_w T_s, where T_s is the symbol period and L_w the number of symbols
within the watermark sequence. Typically, the length (L_b) of each buffer 320 within the
buffering and interpolation stage is L_b = s_f L_w.
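
The two length relations can be written out as simple arithmetic. The function names and the numeric values in the example are hypothetical and serve only to show the orders of magnitude involved.

```python
def detection_window_length(sf, lw, ts):
    """L_D = s_f * L_w * T_s, in samples."""
    return sf * lw * ts


def buffer_length(sf, lw):
    """L_b = s_f * L_w, in watermark symbols."""
    return sf * lw


# Hypothetical values: smoothing factor 4, 512 symbols, 64 samples per symbol.
l_d = detection_window_length(sf=4, lw=512, ts=64)
l_b = buffer_length(sf=4, lw=512)
```

With these values the detection window spans 131072 samples and each buffer holds 2048 symbols.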

In the watermark symbol extraction stage 200 shown in Fig. 8, the incoming
watermarked signal y'[n] is input to the signal conditioning filter H_b (210). This filter 210 is
typically a band pass filter and has the same behavior as the corresponding filter H (15)
shown in Fig. 2. The output of the filter H_b is y'_b[n], and assuming linearity within the
transmission channel, it follows from equations (2) and (3):

$$ y'_b[n] \approx y_b[n] = (1 + \alpha\, w_c[n])\, x_b[n] \qquad (10) $$

Note that when no filter is used in the embedder (i.e., when H=1) then H_b in
the detector can also be omitted, or it can still be included to improve the detection
performance. If H_b is omitted, then y_b in equation (10) is replaced with y. The rest of the
processing is the same.

For simplification, it is assumed that there is perfect synchronism between the
embedder and the detector (i.e. no offset and no change in timescale), and that the audio
signal is divided into frames of length T_s, and that y'_{b,m}[n] is the n-th sample of the m-th
frame of the filtered signal y'_b[n]. It should be noted that if there is not perfect synchronism
between the embedder and the detector, then any deviation can be compensated for within the
buffering and interpolation stage 300 utilizing techniques known to the skilled person e.g.
iteratively searching through all possible scale and offset modifications until a best match is
achieved.

The energy E[m] corresponding to the m-th frame is:

$$ E[m] = \sum_{n=0}^{T_s-1} \left| y'_{b,m}[n]\, S[n] \right|^2 \qquad (11) $$

where S[n] is the same window shaping function used in the watermark conditioning circuit
of Fig. 5. A person skilled in the art will appreciate that equation 11 represents a matched
filter receiver, and is the optimum receiver when the symbol period is perfectly synchronized.
Notwithstanding this fact, from now on, we set S[n]=1 in order to simplify subsequent
explanations.
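
With S[n] = 1, the frame energy reduces to a sum of squared samples per frame. A minimal sketch, in which the name `frame_energies` is hypothetical and trailing samples that do not fill a whole frame are simply dropped (an assumption of this example):

```python
def frame_energies(signal, ts):
    """Return E[m], the sum of squared samples of each length-ts frame."""
    n_frames = len(signal) // ts
    return [
        sum(signal[m * ts + n] ** 2 for n in range(ts))
        for m in range(n_frames)
    ]
```

For the signal [1, 2, 3, 4, 5] and T_s = 2 this gives energies 5 and 25, with the final sample left out.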

Combining this with equation 10, it follows that:

$$ E[m] \approx \sum_{n=0}^{T_s-1} |y_{b,m}[n]|^2 = \sum_{n=0}^{T_s-1} (1 + \alpha\, w_e[m])^2\, |x_{b,m}[n]|^2 \qquad (12) $$

where w_e[m] is the m-th extracted watermark symbol, and contains N_b time-multiplexed
estimates of the embedded watermark sequences. Solving for w_e[m] in equation 12 and
ignoring higher order terms of α, gives the following approximation:

$$ w_e[m] \approx \frac{1}{2\alpha} \left( \frac{E[m]}{\sum_{n=0}^{T_s-1} |x_{b,m}[n]|^2} - 1 \right) \qquad (13) $$

In the watermark extraction stage 200 shown in Fig. 8, the output y'_b[n] of the
filter H_b is provided as an input to a frame divider 220, which divides the audio signal into
frames of length T_s, i.e. into y'_{b,m}[n], with the energy calculating unit 230 then being used to
calculate the energy corresponding to each of the framed signals as per equation (11). The
output of this energy calculation unit 230 is then provided as an input to the whitening stage
H_w (240) which performs the function shown in equation 13 so as to provide an output
w_e[m]. Alternative implementations (240A, 240B) of this whitening stage are illustrated in
Figs. 9 and 10.

It will be realized that the denominator of equation 13 contains a term that
requires knowledge of the host (original) signal x. As the signal x is not available to the
detector, it means that in order to calculate w_e[m] the denominator of equation 13 must
be estimated.

Below is described how such an estimation can be achieved for the two
described window shaping functions (the raised cosine window shaping function and the
bi-phase window shaping function), but it will equally be appreciated that the teaching could be
extended to other window shaping functions.

In relation to the raised cosine window shaping function shown in Fig. 6a, it
has been realized that the audio envelope induced by the watermark contributes
predominantly to the noisy part of the energy function E[m]. The slowly varying part (i.e. the
low frequency components) is predominantly due to the contribution of the envelope of the
original audio signal x. Thus, equation 13 may be approximated by:

$$ w_e[m] \approx \frac{1}{2\alpha} \left( \frac{E[m]}{\mathrm{lowpass}(E[m])} - 1 \right) \qquad (14) $$

where "lowpass(.)" is a low pass filter function. Thus, it will be appreciated that the
whitening filter H_w for the raised cosine window shaping function can be realized as
shown in Fig. 9.

As can be seen, such a whitening filter H_w (240A) comprises an input 242A
for receiving the signal E[m]. A portion of this signal is then passed through the low pass
filter 247A to produce a low pass filtered energy signal E_LP[m], which in turn is provided as
an input to the calculation stage 248A along with the function E[m]. The calculation stage
248A then divides E[m] by E_LP[m] to calculate the extracted watermark symbol w_e[m].
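
A minimal sketch of this whitening step, following equation (14), is given below. The moving average used as the low pass filter, its length, and the handling of zero energy are assumptions of this example; the description does not specify the low pass filter.

```python
def lowpass(values, taps=5):
    """Centered moving average; edges average over the samples available."""
    half = taps // 2
    out = []
    for m in range(len(values)):
        window = values[max(0, m - half): m + half + 1]
        out.append(sum(window) / len(window))
    return out


def whiten_raised_cosine(energies, alpha, taps=5):
    """Estimate w_e[m] as (E[m] / lowpass(E)[m] - 1) / (2 * alpha)."""
    smooth = lowpass(energies, taps)
    return [
        (e / s - 1.0) / (2.0 * alpha) if s else 0.0
        for e, s in zip(energies, smooth)
    ]
```

A constant energy sequence yields zero for every symbol, while frames whose energy lies above or below the local average yield positive or negative estimates respectively.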

When a bi-phase window fimction is employed in the watermark conditioning 
10 stageofiheembedder,adifferentapproachshouldbeutihzedtoestnnatelhec^^^^^ 

original audio, and hence to calculate WeCm]. 

It wiU be seen by examination of the bi-phase window fimction shown in Fig. 
6b that when the envelope of an audio frame is modulated with such a window fimction. the 
firlt and the second halves of the frame are scaled in opposite directions. In the detector, this 
1 5 property is utilized to estimate the envelope energy of the host signal x. 

Consequently, within the detector, the audio frame is first sub-divided into two
halves. The energy functions corresponding to the first and second half frames are hence
given by



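[Equation (15), defining the first half-frame energy function E1[m]; illegible in the source]

and

[Equation (16), defining the second half-frame energy function E2[m]; illegible in the source]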



respectively. As the envelope of the original audio is modulated in opposite directions within
the two sub-frames, the original audio envelope can be approximated as the mean of E1[m]
and E2[m].

Further, the instantaneous modulation value can be taken as the difference
between these two functions. Thus, for the bi-phase window function, the watermark We[m]
can be approximated by:



W_e[m] \approx \frac{1}{2\alpha} \cdot \frac{E_1[m] - E_2[m]}{E_1[m] + E_2[m]}    (17)


Consequently, the whitening filter H_w 240B for a bi-phase window shaping
function can be realized as shown in Fig. 10. Inputs 242B and 243B respectively receive the
energy functions of the first and second half frames, E1[m] and E2[m]. Each energy function
is then split into two, and provided to adders 245B and 246B, which respectively calculate
E1[m] - E2[m] and E1[m] + E2[m]. Both of these calculated functions are then passed to the
calculating unit 248B, which divides the value from adder 245B by the value from adder 246B
so as to calculate the watermark We[m], in accordance with equation 17.
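
A minimal sketch of the arrangement of Fig. 10, given here purely by way of example: the two adders form the difference and the sum of the half-frame energies, and the calculating unit divides one by the other. The function name and the symbol `alpha` (the scaling constant appearing as 2α in equation 17) are assumptions made for this sketch.

```python
import numpy as np

def whiten_biphase(e1, e2, alpha):
    """Estimate We[m] from the half-frame energies E1[m] and E2[m]."""
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    difference = e1 - e2      # adder 245B: instantaneous modulation
    total = e1 + e2           # adder 246B: twice the mean host envelope energy
    return difference / (2.0 * alpha * total)   # calculating unit 248B
```

If the two halves of a frame are scaled by (1 + alpha * w) and (1 - alpha * w) respectively and carry equal host energy, the returned value equals w / (1 + alpha^2 * w^2), which is close to w for small alpha.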

This output We[m] is then passed to the buffering and interpolation stage 300,
where the signal is de-multiplexed by a de-multiplexer 310, buffered in buffers 320 of length
[...] so as to resolve any lack of synchronism between the embedder and the detector, and
interpolated within the interpolation unit 330 so as to compensate for any time scale
modification between the embedder and the detector. Such compensation can utilize known
techniques, and hence is not described in any more detail within this specification.

As shown in Fig. 8, outputs (w_D1, w_D2, ..., w_DNb) from the buffering stage are
passed to the interpolation stage and, after interpolation, the outputs (w_I1, w_I2, ..., w_INb) of
this stage, which correspond to the different estimates of the correctly re-scaled signal, are
passed to the correlation and decision stage. If it is believed that no time scaling compensation
is required, the values (w_D1, w_D2, ..., w_DNb) can be passed directly to the correlation and
decision stage 400, i.e. the interpolation stage 330 can be omitted from the apparatus.

The correlator 410 calculates the correlation of each estimate w_Ij, j = 1, ..., Nb
with respect to the reference watermark sequence w_s[k]. Each respective correlation output
corresponding to each estimate is then applied to the maximum detection unit 420, which
determines which two estimates provide the best fits for the circularly shifted versions w_d1
and w_d2 of the reference watermark. The correlation values (the peak amplitudes and
positions) for these estimate sequences are passed to the threshold detector and payload
extractor unit 430.

Alternatively, if the interpolation stage is omitted, the correlator 410 calculates
the correlation of each estimate w_Dj, j = 1, ..., Nb with the reference watermark sequence
w_s[k], and the results are passed on for subsequent processing to the units 420 and 430 as
outlined in the above paragraph.
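
By way of illustration, the correlation of one estimate with the reference sequence, and the selection of the two strongest peaks, might be sketched as below. The use of an FFT-based circular correlation and the function names are assumptions made for this sketch; the specification does not state how the correlator 410 is implemented.

```python
import numpy as np

def circular_correlation(estimate, reference):
    """Return c[d] = sum_k estimate[k] * reference[(k - d) mod N] for every delay d."""
    est_f = np.fft.fft(np.asarray(estimate, dtype=float))
    ref_f = np.fft.fft(np.asarray(reference, dtype=float))
    return np.real(np.fft.ifft(est_f * np.conj(ref_f)))

def two_strongest_peaks(corr):
    """Return (delay, value) for the two largest-magnitude peaks, ordered by delay."""
    strongest = np.argsort(np.abs(corr))[-2:]
    return sorted((int(d), float(corr[d])) for d in strongest)
```

An estimate that contains two circularly shifted copies of the reference sequence yields two peaks, one at each shift, and the signs of the returned values give the signs of the peaks.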

The threshold detector and payload extractor unit 430 may be utilized to
extract the payload (e.g. information content) from the detected watermark signal. Once the
unit has estimated the two correlation peaks cL1 and cL2 that exceed the detection threshold,
the distance pL between the peaks (as defined by equation (6)) is measured. Next, the signs
ρ1 and ρ2 of the correlation peaks are determined, and hence r_sign is calculated from equation
(7). The overall watermark payload may then be calculated using equation (8).

For instance, it can be seen in Fig. 11 that pL is the relative distance between
the two peaks. Both peaks are positive, i.e. ρ1 = +1 and ρ2 = +1. From equation (7), r_sign = 3.

Consequently, the payload pL_w = <3, pL>.

The reference watermark sequence w_s used within the detector corresponds to
(a possibly circularly shifted version of) the original watermark sequence applied to the host
signal. For instance, if the watermark signal was calculated using a random number generator
with seed S within the embedder, then equally the detector can calculate the same random
number sequence using the same random number generation algorithm and the same initial
seed so as to determine the watermark signal. Alternatively, the watermark signal originally
applied in the embedder and utilized by the detector as a reference could simply be any
predetermined sequence.
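
The seed-sharing arrangement can be sketched as follows, as an illustration only. The choice of NumPy's generator and of normally distributed values is an assumption made for this sketch; all that is required is that the embedder and the detector use the same algorithm and the same seed S.

```python
import numpy as np

def reference_watermark(seed, length):
    """Regenerate the watermark sequence from a shared seed."""
    rng = np.random.default_rng(seed)   # same algorithm on both sides
    return rng.standard_normal(length)  # value distribution is an assumption
```

Calling `reference_watermark(S, N)` in the embedder and again in the detector returns identical sequences, so the sequence itself never needs to be transmitted.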

Fig. 11 shows a typical shape of a correlation function as output from the
correlator 410. The horizontal scale shows the correlation delay (in terms of the sequence
bins). The vertical scale on the left hand side (referred to as the confidence level cL)
represents the value of the correlation peak normalized with respect to the standard deviation
of the (typically normally distributed) correlation function.

As can be seen, the typical correlation is relatively flat with respect to cL, and
centered about cL = 0. However, the function contains two peaks, which are separated by pL
(see equation 6) and extend upwards to cL values that are above the detection threshold when
a watermark is present. When the correlation peaks are negative, the above statement applies
to their absolute values.

A horizontal line (shown in Fig. 11 as being set at cL = 8.7) represents the
detection threshold. The detection threshold value controls the false alarm rate.
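
The normalization to a confidence level and the comparison with the detection threshold might be sketched as follows. Estimating the standard deviation from the whole correlation function (peaks included) and the function names are assumptions made for this sketch; the default threshold of 8.7 is the value shown in Fig. 11.

```python
import numpy as np

def confidence_levels(corr):
    """Normalize a correlation function by its standard deviation."""
    corr = np.asarray(corr, dtype=float)
    return corr / np.std(corr)

def peaks_above_threshold(corr, threshold=8.7):
    """Return (delay, cL) for every bin whose absolute cL reaches the threshold."""
    c_l = confidence_levels(corr)
    hits = np.flatnonzero(np.abs(c_l) >= threshold)   # negative peaks count by magnitude
    return [(int(d), float(c_l[d])) for d in hits]
```

A watermark would be reported for a detection interval when this function returns two peaks; their delays give the peak separation and their signs give the peak polarities.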

Two kinds of false alarms exist: the false positive rate, defined as the
probability of detecting a watermark in non-watermarked items, and the false negative rate,
which is defined as the probability of not detecting a watermark in watermarked items.
Generally, the requirement on the false positive rate is more stringent than that on the false
negative rate. The right hand side scale on Fig. 11 illustrates the probability of a false positive
alarm p. As can be seen, in the example shown, the probability of a false positive p = 10^[...] is
equivalent to the threshold cL = 8.7, whilst p = 10^[...] is equivalent to cL = 20.

After each detection interval, the detector determines whether the original
watermark is present or whether it is not present, and on this basis outputs a "yes" or a "no"
decision. If desired, to improve this decision making process, a number of detection windows
may be considered. In such an instance, the false positive probability is a combination of the
individual probabilities for each detection window considered, dependent upon the desired
criteria. For instance, it could be determined that if the correlation function has two peaks
above a threshold of cL = 7 on any two out of three detection intervals, then the watermark is
deemed to be present. Obviously, such detection criteria can be altered depending upon the
desired use of the watermark signal and to take into account factors such as the original
quality of the host signal and how badly the signal is likely to be corrupted during normal
transmission.
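
The "two out of three" criterion mentioned above can be sketched as follows, together with the resulting combined false positive probability. The assumption that the detection intervals are statistically independent, and the function and parameter names, belong to this sketch and are not taken from the specification.

```python
from math import comb

def watermark_present(peak_counts, required_peaks=2, required_hits=2):
    """Decide "yes" when enough intervals each show enough peaks above threshold."""
    hits = sum(1 for n in peak_counts if n >= required_peaks)
    return hits >= required_hits

def combined_false_positive(p_single, windows=3, required_hits=2):
    """Probability that at least `required_hits` of `windows` independent
    intervals each give a false positive with probability `p_single`."""
    return sum(
        comb(windows, k) * p_single ** k * (1 - p_single) ** (windows - k)
        for k in range(required_hits, windows + 1)
    )
```

For example, a per-interval false positive probability of 0.1 gives a combined probability of 0.028 under the two-out-of-three rule, which is why a lower threshold per interval may be acceptable when several intervals are combined.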

It will be appreciated by the skilled person that various implementations not
specifically described would be understood as falling within the scope of the present
invention. For instance, whilst only the functionality of the embedding and detecting
apparatus has been described, it will be appreciated that the apparatus could be realized as a
digital circuit, an analog circuit, a computer program, or a combination thereof.

Equally, whilst the above embodiment has been described with reference to an
audio signal, it will be appreciated that the present invention can be applied to other types of
signal, for instance video and data signals.

Within the specification it will be appreciated that the word "comprising" does
not exclude other elements or steps, that "a" or "an" does not exclude a plurality, and that a
single processor or other unit may fulfil the functions of several means recited in the claims.

CLAIMS:


1. A method of generating a watermark signal for embedding in a multimedia
signal, the method comprising the steps of:

(a) generating two sequences of values, the second sequence being a circularly shifted
version of the first sequence; and

(b) generating a watermark signal by adding the values of the first sequence to the respective
values in the corresponding positions of the second sequence.



2. A method as claimed in claim 1, wherein each value of the first and second
sequences is represented by a pulse of width Ts so as to form rectangular wave signals.

3. A method as claimed in claim 2, wherein in step (a) a window shaping
function is applied to convert each of the rectangular pulse train signals into
smoothly varying signals, with the resulting smoothly varying signals being added in step (b)
to form the watermark signal.

4. A method as claimed in claim 1, wherein each one of said sequences of values
is convolved with a window shaping function which has a width of at least Ts, so as to
generate two smoothly varying signals, these smoothly varying signals being added together
in step (b) so as to form the watermark signal.

5. A method as claimed in claim 4, wherein said window shaping function has a
band limited frequency behavior and a smooth temporal behavior.

6. A method as claimed in claim 5, wherein the window shaping function has a
symmetric or anti-symmetric temporal behavior.

7. A method as claimed in claim 4, wherein said window shaping function
comprises at least one of a raised cosine function and a bi-phase function.

8. A method as claimed in claim 4, wherein the watermark signal is generated by
the addition of the two smoothly varying signals with a relative delay of Tr, where Tr < Ts.

9. A method as claimed in claim 8, wherein Tr is chosen such that maximum
amplitude points of the first smoothly varying signal coincide with zero-crossings of the
second smoothly varying signal, and vice-versa.

10. A method as claimed in claim 1, wherein said watermark signal has a payload
that is encoded in the combination of said two sequences of values.

11. An apparatus arranged to generate a watermark signal for embedding in a
multimedia signal, the apparatus comprising: (a) a sequence generator arranged to use a first
sequence of values to generate a second sequence of values, the second sequence being a
circularly shifted version of the first sequence; and (b) a signal generator arranged to generate
a watermark signal by adding the values of the first sequence to the respective values in the
corresponding positions of the second sequence.

12. An apparatus as claimed in claim 11, the apparatus further comprising a signal
conditioner arranged to convert each sequence of values into a smoothly varying signal.

13. An apparatus as claimed in claim 11, wherein the apparatus is arranged to
generate said first sequence of values by circularly shifting a primary sequence of values.

14. A method of embedding a watermark in a multimedia signal, the method
comprising the steps of: (a) generating a watermark signal equal to the sum of two sequences
of values, the second sequence being a circularly shifted version of the first sequence of
values; (b) generating a host modifying multimedia signal as a product of the watermark
signal and the multimedia signal; (c) generating a watermarked multimedia signal by adding
a scaled version of said host modifying multimedia signal to the multimedia signal.

15. A method as claimed in claim 14, wherein said scaled version of the host
modifying signal is generated by controlling the scaling factor by a predetermined cost
function.

16. A method as claimed in claim 15, wherein said cost function comprises
multiple scaling factors, each scaling factor being defined separately for one or more of the
plurality of frequency bands in the multimedia signal.

17. A method as claimed in claim 16, wherein said frequency bands are
determined according to a model of the human auditory and/or visual system.

18. A method as claimed in claim 14, wherein in step (b) said host modifying
multimedia signal is generated by multiplying said watermark signal with an extracted
portion of the multimedia signal.

19. A method as claimed in claim 18, wherein said extracted portion of the
multimedia signal is obtained by filtering at least a portion of the multimedia signal with
respect to at least one of frequency, space and time.

20. A method as claimed in claim 14, wherein the method further comprises the
steps of: (d) generating a second watermark signal equal to the sum of a third and a fourth
sequence of values, the fourth sequence being a circularly shifted version of the third
sequence of values; (e) extracting a second portion of the multimedia signal, the second
portion being filtered such that it does not overlap with said first portion; (f) generating a
watermarked multimedia signal by adding the product of the second watermark signal and
the second extracted portion of the multimedia signal to the watermarked multimedia signal.

21. An apparatus arranged to embed a watermark signal in a multimedia signal,
the apparatus comprising: (a) a watermark generator arranged to generate a signal equal to
the sum of two sequences of values, the second sequence being a circularly shifted version of
the first sequence of values; (b) an output signal generator arranged to generate a
watermarked multimedia signal by adding the product of the watermark signal and the
multimedia signal to the multimedia signal.

22. An apparatus as claimed in claim 21, wherein the apparatus further comprises 
a signal extractor arranged to extract a first portion of the multimedia signal. 

23. A multimedia signal comprising a watermark, wherein the original multimedia
signal has been watermarked by modifying the temporal envelope of the original signal by
the watermark, the watermark comprising the sum of a first and a second sequence of
values, the second sequence of which is a circularly shifted version of the first sequence.

24. A method of detecting a watermark signal embedded in a multimedia signal,
the method comprising the steps of: (a) receiving a multimedia signal that may potentially be
watermarked by a watermark signal modifying the temporal envelope of the host multimedia
signal; (b) extracting an estimate of the watermark from said received signal; and (c)
correlating the estimate of the watermark with a reference version of the watermark so as to
determine whether the received signal is watermarked.

25. A method as claimed in claim 24, further comprising the step of applying a
window shaping function to said received signal.

26. A method as claimed in claim 24, wherein the watermark signal has a payload,
and the method further comprises the step of determining the payload of the watermark.

27. A watermark detector apparatus arranged to detect whether a watermark signal
is embedded within a multimedia signal, the watermark detector comprising: (a) a receiver
arranged to receive a multimedia signal that may potentially be watermarked by a watermark
signal modifying the temporal envelope of the host multimedia signal; (b) an extractor
arranged to extract an estimate of the watermark from said received signal; and (c) a
correlator arranged to correlate the estimate of the watermark with a reference version of the
watermark so as to determine whether the received signal was watermarked.

28. An apparatus as claimed in claim 27, wherein the apparatus further comprises
a detector arranged to determine if a payload is present within said watermark and to
determine the value of said payload.

29. A computer program arranged to perform at least one of the method of claim 1
and the method of claim 24.



30. A record carrier comprising a computer program as claimed in claim [...]

31. A method of making available for downloading a computer program as
claimed in claim 29.

ABSTRACT:

A method of generating a watermark signal, embedding the watermark signal
within a multimedia signal, and subsequently detecting the watermark signal is described.
The watermark signal is the sum of two sequences of values, the second sequence of values
being a circularly shifted version of the first sequence.
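
Purely as an illustration of the generation and embedding steps summarized above, a minimal sketch follows. It omits the window shaping stage, assumes the watermark has already been expanded to one value per host sample, and uses a single scaling factor `alpha`; the function names and these simplifications are assumptions made for the sketch.

```python
import numpy as np

def make_watermark(first, shift):
    """Sum of a sequence and a circularly shifted copy of itself."""
    first = np.asarray(first, dtype=float)
    second = np.roll(first, shift)      # circular shift by `shift` positions
    return first + second

def embed(host, watermark, alpha=0.05):
    """Add a scaled product of the watermark and the host to the host."""
    host = np.asarray(host, dtype=float)
    return host + alpha * watermark * host
```

For instance, `make_watermark([1, 2, 3, 4], 1)` adds `[1, 2, 3, 4]` to `[4, 1, 2, 3]`, and `embed` then scales the host sample by sample with the factor (1 + alpha * watermark), which is the temporal envelope modification referred to in the claims.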